updated: 2022-01-23_12:32:31-05:00


Non-Context Free Languages

Pumping Lemma

Introduction and Givens

let G be a CFG for a CFL L
let b be the maximum number of terminals on the RHS of a rule
let w be the string derived by G
let v be the number of variables

w = w1, w2, w3...

S => ... => ... => ... => w
^ length (number of derivations) is k steps long

At every step in derivation, we can get at most b new terminals

Example
S > aTb
T > aU
U > bV
V > baV | e
S => aTb => aaUb => aabVb => aabbaVb => aabbab
Number of derivations is 5. 5 is more than 4 (variables)
\therefore one variable must loop

length of w for derivation of length k:
wkb|w|\leq kb
kbwkb\geq |w|
kwbk \geq \frac{|w|}{b}

A derivation of length k contains

  1. How many variables if repetition is allowed?
  2. How many variables if repetition is not allowed?

If we pick k > |v| then at least two intermediate steps will contain a variable
i.e. A variable will be repeated in derivation

To ensure this repetition we choose w such that kvk \geq |v|

  • wbv\frac{|w|}{b}\geq |v|
  • wbv|w|\geq b\star |v|
    • Choosing this guarantees a repetition of a variable in the derivation

$S \stackrel{}{\Rightarrow} uAz \stackrel{}{\Rightarrow} vAy\stackrel{\star}{\Rightarrow} w$

A is the variable that is repeating

wbv|w|\geq b\star |v|

S{ u A{ v A{ x } y } z... }

If you have a grammar like this, we can get no repetitions, one repetitions, or many repetitions

What does this mean?

uvkxykzLuv^{k}xy^{k}z\in L for k \geq 0
if k = 0 -> uxz
if k = 1 -> uvxyz
if k = 2 -> uv2^2xy2^2z

We want to come up with a string that has has repeats

Theorem

  • If L is a CFL, then there is a number P > 0 called pumping length such that if w\inL and |w|\geqp, then w can be written as uvxyz where:
    1. |vxy| \leq P
    2. vy \neq e
    3. uvK^{K}xyK^{K}z \in L for all k \geq 0

Proof:
Let L be a CFL and let G be the CFG that generates L
Let b be the maximum number of symbols on RHS of any rule
V is the number of variables
Let p = max(bV^{|V|}+1,bV+1^{|V|+1})
Parse tree:
Number of steps is the same as the height of the parse tree
max length of string is bh^h
consider the smallest possible parse tree for w
Subtree w:
Parse tree
because |w|\geq bv^{|v|} the height of the tree must be greater than |v|
Let π\pi be one of the longest paths in the tree, length of π\pi > |v|
this means π\pi contains repetition among the bottom |v| + 1 variables
then w = uvxyz is generated by repeating the portion of the tree that denotes "VAY"
parse tree can be constructed for every string of form uvK^{K}xyK^{K}z \in L for all k \geq 0
height >= |V| implies repetition

Example: a^n b^n c^n

L = {anbncnn0a^nb^nc^n|n\geq0}
Suppose L is CFL
let p be the pumping length provided by PL
consider wLw\in L and w=apbpcpw = a^pb^pc^p
clearly wp|w|\geq p
According to PL, w can be represented as uvxyz where:
1. |vxy| \leq P
2. vy \neq e
3. uvK^{K}xyK^{K}z \in L for all k \geq 0

vxy could be anywhere in the string. Can't say it's all a's or b's because of that. There are multiple cases

Two cases:

  1. v or y contain more than one kind of symbol (2 symbols, either a & b or b & c)
    • let k = 2
    • uv2xy2zuv^2xy^2z
      • v or y have 2 characters, so repeating them means they are out of order when power 2 applies
      • so if v x y was aa a ab => aaaa a abab
      • now there is an a after a b...
    • also, the number of b's and c's is different
  2. v & y each contain one type of symbol
    • uv2xy2zuv^2xy^2z
    • Options:
      • v is all a's; y is all b's
        • ap+vbp+ycpLa^{p+|v|}b^{p+|y|}c^{p}\notin L
      • v is all b's; y is all c's
        • apbp+vcp+yLa^pb^{p+|v|}c^{p+|y|}\notin L
          One of these cases must occur, \therefore a contradiction must occur

Example: ww

L = {ww} (not a palindrome)

bla bla proof stuff

  1. |vxy| \leq P
  2. vy \neq e
  3. uvK^{K}xyK^{K}z \in L for all k \geq 0

let w = 0p1p0p1p0^p1^p0^p1^p
w consists of 4 blocks of length p

| vxy | \leq p => vxy can can touch at most two blocks

Case 1

suppose that v and y are both contained within a single block (same symbol)

say block 1. ie: vxy is all 0's

let k = 2

uv2xy2zuv^2xy^2z => 0p+i1p0p1p0^{p+i}1^p0^p1^p for i > 0

uv2xy2zLuv^2xy^{2}z\notin L

if v & y are all 1's:
0p1p+i0p1p0^p1^{p+i}0^p1^p is also \notin L

Now, for HW, look at multiple cases

Case 2

suppose v & y each have two different symbols

0's and 1's mixed up

uv2xy2zLuv^2xy^2z\notin L because the 0's and 1's are not in order

therefore not in the language, and since neither is in language, not a CFL

...

Example: L = {a^i b^j c^k | 0 <= i <= j <= k}

Let L be a CFL and let p be the pumping length given by a pumping length

w = ap^p bp^p cp^p

  • we are choosing this because we know a^n b^n c^n not cfl, but a little different since we will have to pump up due to the less than

|w| \geq p

therefore according to pumping lemma w can be represented as uvxyz where:

  1. |vxy| \leq P
  2. vy \neq e
  3. uvK^{K}xyK^{K}z \in L for all k \geq 0

Case 1

Both v and y contain the same symbol
three subcases.. can't be generalized

  1. v & y are in the middle (all b's)
    • b^p => vxy
    • pumping down ... apbpmcpa^{p}b^{p-|m|}c^{p}
    • |a| > |b|
    • soooo
    • uv0xy0zLuv^{0}xy^{0}z\notin L because |a| > |b|
  2. v and y are in the 1st part of the L (all a's)
    • pumping down won't work, so we need to pump up
    • Pumping up
      • k = 2
      • ap+mbpcpla^{p+|m|}b^{p}c^{p}\notin l
  3. v and y are in the last part of L (all C's)
    • pump down
      • apbpcp=mLa^{p}b^{p}c^{p=|m|}\notin L

Case 2

both v and y contain different symbols
can be generalized:

  1. v is all a's and some b's ; y is all b's
  2. v is all b's and some c's; y is all c's
    OR
  3. v is some a's
  4. b is some a's and some b's

you can say "either v or y contain 2 different symbols" (thanks Nick!)

either way, the order is going to be wrong

  • like if v = aaabbb, v^2 is aaabbbbaaabbb
    • now a's are after b's

Pump up: uv2xy2zLuv^{2}xy^{2}z \notin L

blabla, therefore not in set of CFL etc


Homework by Friday:

(from book)
2.30:
a) {0n^n 1n^n 0n^n 1n^n | n \ge 0} is not a CFL
d) {t1_1 # t2_2 # ... # tk_k | k \ge 2, each ti_i \in {a,b}^{\star} and ti_i = tj_j for some i \ne j}
2.33:
show that f = {ai^{i} bj^{j} | i = kj for some positive integer k} is not a CFL


Example: L = {1^i # 1^j # 1^{i+j} | i <= j}

" CFL Pumping lemma stuff

choose w = 1p^{p} # 1p^{p} # 12p^{2p}
uvxyz

Case 1

either v or y contains #

  • uv2^{2}xy2^{2}z => 1p^{p}##1p^{p}#12pL^{2p}\notin L
  • uv0^{0}xy0^{0}z => 1p^{p}1p^{p} # 12pL^{2p}\notin L

^^ one is pumping up, one is down, we don't need to do both, just show one

She is reexplaining this case...
uv2^{2}xy2^{2}z => #1#1#1^p ... too many hashes
probably ignore this to not get confused

Case 2

neither v or y contain #

v & y could be first, second, or third block of the string (like between the #)

they will fall within a block

  1. first block (v & y are in first block)
    • length of vxyp|vxy| \le p, \therefore v and y are in 1st block
    • soo, we should pump up
    • uv2^{2}xy2^{2}z\notin L because i > j
  2. second block (v & y are in second block)
    • pump down
    • uv0^{0}xy0^{0}z \notin L bc i>j
  3. third block (v & y are in third block)
    • pump up
    • uv2^{2}xy2^{2}z => 1p^{p}#1p^{p}#2p+mL^{2p+|m|}\notin L
    • p+p 2p+m\neq 2p + |m|

to be continued